
Geocities Datasets

Sub-collection of Web Archive Datasets





Geocities Datasets
by Nick Ruest
data

Views: 93, Favorites: 3, Comments: 0

Web archive derivatives of the GeoCities collection from the Internet Archive. The derivatives were created with the Archives Unleashed Toolkit. The geocities-aut-parquet-derivatives.xz file, once extracted, produces a directory for each derivative in the Apache Parquet format, a columnar storage format. Similarly, the geocities-aut-csv-derivatives.xz file produces a directory for each derivative in the CSV format. These derivatives are generally small enough to work with on your...
Topics: csv, parquet, gephi, apache spark
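
Once an archive has been extracted, the one-directory-per-derivative layout described above could be explored with a short Python sketch like the following. It uses only the standard library. The function names are hypothetical, and it assumes each derivative directory holds one or more files ending in .csv; the listing does not confirm the file names or whether the files carry a header row, so rows are returned as plain lists.

```python
import csv
from pathlib import Path


def list_derivatives(root):
    """Return the sorted names of the derivative directories under root."""
    return sorted(p.name for p in Path(root).iterdir() if p.is_dir())


def read_csv_derivative(derivative_dir):
    """Yield each row (a list of strings) from every .csv file in one directory.

    Assumption: the derivative is stored as one or more *.csv files directly
    inside the directory. No header handling is done, because the listing
    does not say whether the files include a header row.
    """
    for part in sorted(Path(derivative_dir).glob("*.csv")):
        with part.open(newline="", encoding="utf-8") as handle:
            yield from csv.reader(handle)
```

For the Parquet derivatives, pandas.read_parquet accepts a path to a directory of Parquet files when a Parquet engine such as pyarrow is installed, so a similar per-directory loop should work there as well.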